OBJECT PERSISTENCE: BEYOND SERIALIZATION

by Timo Salo, Justin Hill, Scott Rich, Chuck Bridgham, and Daniel Berg 
Our authors describe techniques and frameworks necessary to successfully implement scalable
object persistence for complex database systems. Much of the technology they examine has been 
incorporated in development tools ranging from VisualAge for Java, to EJB tools for WebSphere. 

JAVA PROXIES FOR DATABASE OBJECTS 

by Paul Upton

Java proxy technology lets you define database object schema using the database ODL. To illustrate
how such a technology might be implemented, Paul provides examples based on the Jasmine
object-oriented database.

VBSCRIPT AND SQL CALENDARS 

by John Donovan Lambert 

John presents the VBScripts he uses for inputting SQL results into a web calendar, and discusses
how you can port these scripts to Java, Perl, Cold Fusion, or whatever language you prefer.

THE CVS DATA FORMAT

by Cesar A. Gonzalez Perez 

The CVS data format stores cartographic data for a specific geographic area into a single file. Cesar
examines the format, then presents a tool for converting CVS files into DXF format.

AGENT ITINERARIES

by Russell P. Lentini, Goutham P. Rao, and Jon N. Thies 

Instead of examining itineraries in the traditional way as a list of tasks to be performed by agents,
our authors treat itineraries as a metaprogram, a way of programming an agent and inadvertently
its goal. To illustrate, they'll present an itinerary that performs a database query.

JAVA AND DIGITAL IMAGES 

by David H. Martin and Johnny Martin 

Capturing, storing, and retrieving images is an often-overlooked feature that many applications
could benefit from. David and Johnny describe "Grabber for Java," an API that encapsulates the
functionality necessary for video capture.
THE SPARK REAL-TIME KERNEL

by Anatoly Kotlarsky 

SPARK, short for "Small Portable Adjustable Real-time Kernel," is a royalty-free, fast, tiny, 
portable real-time kernel. Anatoly describes how he used it to build a video bar-code 
scanner.
AUTOMATED TESTING FOR WEB APPLICATIONS

by M. Selvakumar

The technique for automated web-user-interface testing presented here is based on HTML,
JavaScript, and CGI, and implemented for Netscape Communicator 4.04 and Apache 1.2.
THE VERSION CONTROL PROCESS

by Aspi Havewala 

Source-code version control is a set of working rules for code sharing that lets 
developers modify files in an exclusive way. As such, it is one of the most 
important, yet least understood, areas of software development. 
Increasing productivity
and reducing maintenance

Timo Salo, Justin Hill, Scott Rich, Chuck Bridgham, and Daniel Berg

Most commercial high-volume databases are based on either
the relational or service paradigm (that is, databases
encapsulated within transaction processing monitors).
Persisting objects in these nonobject-oriented databases
is a major challenge when building large-scale applications.

On a small scale, object persistence is easy to solve.
Serialization, for example, has been presented as a method for
providing simple object persistence. However, scaling up
introduces a new set of requirements. Many enterprise object
systems involve object models with complex inheritance
hierarchies and large numbers of object relationships. The
run-time configuration often includes multiuser databases that can
be both relational and nonrelational. The object model and
database model are often designed by different groups of
people, therefore requiring a loose coupling between the
models. The design of a scalable object persistence framework
must adequately address issues related to performance with
complex object models, support for complex object
transactions, transformations from object inheritance structures and
associations to native database structures, translating object
queries to native database queries, and accessing objects across
multiple database paradigms.

The authors are software engineers working in IBM's
VisualAge Features Development group. They can be contacted at
tjsalo@us.ibm.com.

There are several standards and specifications related to ob- 
ject databases and object persistence, including the Object Man- 
agement Group (OMG) Standard, Object Database Manage- 
ment Group (ODMG) Standard, and Enterprise JavaBeans (EJB) 
Specification. However, none of these specifications address 
the actual implementation of a persistence engine. At best they 
describe interfaces and high-level components that form the 
API of the system. 

In this article, we'll describe techniques and frameworks
required to successfully implement scalable object persistence
for complex systems. We'll address topics such as required
metainformation, read-ahead and caching, queries, object
associations, and concurrent and nested transactions. We have
pioneered these techniques for almost 10 years in many
large-scale projects. Various aspects of the technology we describe
have been incorporated in IBM development tools, including
VisualAge for Java (Persistence Builder), VisualAge for Smalltalk
(ObjectExtender), and EJB development tools for WebSphere.

Figure 1: High-level architecture for a persistence framework.

General Architectures 

Persistence frameworks typically consist of two high-level
components: the development-time toolkit and the run-time
persistence engine. Figure 1 is an example of high-level
architecture for a persistence framework.

The development toolkit usually includes tools for collecting
metainformation about the object model and database, and tools
for generating business object classes and database queries.

There are two approaches for implementing the run-time
engine. One approach is to have the metainformation
available at run time, and generate the queries for retrieving
objects on the fly as the application traverses various object
relationships. This approach makes it possible to build dynamic,
flexible applications that have no navigation restrictions
within the object model. However, the amount of memory used
by the metainformation and run-time query generation
usually results in poorer performance. Another approach is to
generate the queries at development time. Little explicit
metainformation is needed at run time with this approach.
Execution of the generated queries is faster, because run-time
inferencing is not needed and the queries can often be optimized
for the database. The drawback is that the object model
traversal paths are fixed. If more paths are needed, more queries
need to be generated and compiled.

Figure 2: Relationships between various metamodels.

Metainformation 

Metainformation for an object persistence framework includes
information about the application's object model, the target
database's data model, and the queries needed to service the
application. As Figure 2 shows, the information is often grouped
into the following models:

• The data model for describing the relevant subset of the
database schema.

• The persistent object model for describing the persistent
components of the business domain model.

• The mapping model for describing the mapping between the
object model and the data model.

How much detail is captured and whether the metainformation
is partitioned in one large model or various separate
submodels depends on issues of flexibility, efficiency, and
expressiveness. Therefore, there is no single correct way to package
the information, but all the following must be captured in some
form somewhere in the framework.

The data model represents the logical view of the database. It
is a subset of the tables, views, and columns in the database schema
that are relevant to object systems. This includes information on
entity qualifier names, logical and physical names of entities,
column datatypes, and conversions from database types to object
language types. Further refinements could include information on
database column functions such as sums and averages.

The data model can be augmented with information that is
not explicitly kept in the database schema. For instance, the
relationships implicitly defined by the foreign-key references
in the schema can be modeled as first-class connection
objects in the data model. Enhancing the data model with
connections makes the mapping of object associations to database
relationships a significantly simpler task. Figure 3 presents an
enhanced data model (the structural data model).

Figure 3: A structural data model.

It is not absolutely necessary to have a separate data mod- 
el. However, without such a model, much of this information 
must be captured in the mapping model, thus overloading its 
behavior and state. 

The persistent object model is a subset of the application's
object model. It represents only the portion of the business
object model that requires persistence behavior. It can be a
subset of classes within the complete business object model and a
subset of the instance variables within a single class. The
allowed types for the attributes can also be captured for
validation purposes.

Besides modeling the simple attributes, the associations
between the persistent objects can also be modeled. This makes the
object model independent from the mapping model, allowing a
clear mapping between the foreign-key relationships in the data
model and the object associations in the persistent object model.

The definition for the object identifiers can be captured in the
object model rather than in the mapping model, again allowing
simple mapping between the primary key column(s) in the data
model and object identifier in the object model.

The persistent object model is optional, and much of the
information that it provides can be held in the mapping model.
However, without the object model (as well as without the data
model) there is a risk of overloading the behavior and state of
the mapping model.

The most minimal system that would be of any interest
requires at least a model of mapping between the object
structure on one side and the target database structure on the
other. The mapping model contains the essential instructions
to the system of where the data retrieved from the database
is to be placed in the objects. The mapping model must
define which object class corresponds to which table and which
object attributes correspond to which columns. Refinements
could include mapping one object to multiple tables and one
instance variable to multiple columns, conversions of column
data from primitive types to higher-level object types, and
defining which columns act as database conflict-detection
predicates. Figure 4 shows examples of class-to-table
mapping schemes.
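
The minimal class-to-table, attribute-to-column mapping described above can be sketched in Java as follows. This is only an illustration under stated assumptions: the names `ClassMap`, `selectByKeySql`, and the `EMP` table and its columns are hypothetical and do not come from any IBM tool. The sketch shows how a mapping entry held as run-time metainformation could drive the generation of a read query.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical mapping-model entry: one object class mapped to one table.
final class ClassMap {
    final String className;
    final String tableName;
    final String keyColumn;
    // attribute name -> column name, kept in declaration order
    final Map<String, String> attributeToColumn = new LinkedHashMap<>();

    ClassMap(String className, String tableName, String keyColumn) {
        this.className = className;
        this.tableName = tableName;
        this.keyColumn = keyColumn;
    }

    ClassMap map(String attribute, String column) {
        attributeToColumn.put(attribute, column);
        return this;
    }

    // Generates a parameterized read query from the mapping alone.
    String selectByKeySql() {
        StringJoiner columns = new StringJoiner(", ");
        for (String column : attributeToColumn.values()) {
            columns.add(column);
        }
        return "SELECT " + columns + " FROM " + tableName
                + " WHERE " + keyColumn + " = ?";
    }
}

final class MappingDemo {
    // Assumed example schema: Employee objects stored in table EMP.
    static ClassMap employeeMap() {
        return new ClassMap("Employee", "EMP", "EMPNO")
                .map("id", "EMPNO")
                .map("name", "ENAME")
                .map("department", "DEPTNO"); // foreign-key column
    }

    public static void main(String[] args) {
        System.out.println(employeeMap().selectByKeySql());
    }
}
```

The same entry could equally be consumed by a development-time generator that emits the query as source code, which corresponds to the second run-time engine approach discussed earlier.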

If associations are to be supported transparently, then the
mapping must also define which foreign-key relationship corresponds
to which object association in the object model. Figures 5 and 6
illustrate various relationship-mapping schemes.

Finally, if inheritance is supported then the mapping model
should capture all such information. This would include the type
of inheritance employed in the database, type discriminator
values for choosing the appropriate class, and/or foreign-key
relationships between tables. Figure 7 shows examples of
inheritance mapping schemes.

Cache

Various read-ahead and caching strategies can improve a
persistence framework's efficiency and flexibility. Without
read-ahead and caching capabilities, the application is always starved
for data, parsimoniously reading from the database as
associations in the persistent object model are traversed and bringing
back data only one level at a time. With an object model that
has many relationships, this can cause a large number of
expensive database roundtrips.

A read-ahead scheme lets the application minimize the
number of database roundtrips by retrieving large object composition
trees within one query. Read-ahead involves instantiating the
requested objects and caching the data for their related objects,
thereby making sure that the data is present for the objects that
are most likely needed next by the application. How far ahead
objects are read is determined by application requirements.
Flexibility is gained as the queries can be tuned without affecting the
structure or workflow of the application.

Figure 5: 1:1 association mapping schemes.

Reading objects ahead often results in too much data.
Therefore, it is desirable to keep the data in binary format to delay
or avoid the performance cost of instantiating unused objects.
Instantiation of persistent objects is then performed in two
stages: First, the data is brought into the cache, then the
objects are instantiated from the cache upon demand. Leaving
the data in a form that is smaller than a fully instantiated
object saves space as well.

The key to implementing the read-ahead feature is to extend
the caching scheme to include the relationship semantics of the
underlying database. Database queries have fixed access paths
that may differ from the object model navigation order.
Therefore, the data in the cache must be organized in a fashion that
allows dynamically composing any access paths defined in the
database. In the case of relational databases, this means that
the foreign-key references are extracted from the result set and
maintained in a structured data cache. Figure 8 shows a
structured data cache.

Registry 

To guarantee the uniqueness of the objects within the
application's memory, each instantiated persistent object must be
registered into a centralized registry. The objects are usually
identified in the registry using their persistent object
identifiers; see Figure 9.

As Figure 10 illustrates, when an object is retrieved using its 
object identifier the registry is searched first, then the data cache, 
and finally the database. The registry can be global if it is im- 
plemented using weak pointers, because objects are automati- 
cally removed from the registry when other objects no longer 
reference them. However, if weak pointers are not available, the 
registry must be localized. For example, transactions provide a 
good scope for local registries. 
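
The lookup order described above (registry first, then the data cache, then the database) can be sketched in Java using `java.lang.ref.WeakReference` as the weak pointer. This is a minimal sketch, not the authors' implementation: `Customer`, `CustomerRegistry`, and the function standing in for a database read are all hypothetical names chosen for illustration.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical persistent object; only the identifier and one field matter here.
final class Customer {
    final String id;
    final String name;
    Customer(String id, String name) { this.id = id; this.name = name; }
}

// Sketch of the lookup order: registry, then data cache, then database.
final class CustomerRegistry {
    // Weak references let the garbage collector reclaim unreferenced objects.
    private final Map<String, WeakReference<Customer>> registry = new HashMap<>();
    // Raw, uninstantiated row data {id, name} keyed by object identifier.
    private final Map<String, String[]> dataCache = new HashMap<>();
    // Stand-in for a database read; returns a row {id, name} or null.
    private final Function<String, String[]> database;
    int databaseReads = 0;

    CustomerRegistry(Function<String, String[]> database) {
        this.database = database;
    }

    // Used by read-ahead queries to park data for objects not yet requested.
    void cacheRow(String[] row) { dataCache.put(row[0], row); }

    Customer find(String id) {
        WeakReference<Customer> ref = registry.get(id);
        Customer existing = (ref == null) ? null : ref.get();
        if (existing != null) return existing;        // 1. registry hit
        String[] row = dataCache.get(id);              // 2. data cache
        if (row == null) {
            row = database.apply(id);                  // 3. database roundtrip
            databaseReads++;
            if (row == null) return null;
            dataCache.put(id, row);
        }
        Customer created = new Customer(row[0], row[1]); // instantiate on demand
        registry.put(id, new WeakReference<>(created));
        return created;
    }
}
```

Note that this sketch never removes map entries whose referent has been collected; a fuller version would purge them, for example with a `ReferenceQueue`.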
Queries

From the persistence framework's point of view, queries are the 
behavior of persistent objects on their target database. Query in 
this context means any operation supported by the target database 
and executed by the persistence framework. This includes ba- 
sic create, read, update, and delete operations; inquiries (does 
an entity exist in the database, the sum of a set of columns); 
and specific operations defined by a particular database server 
such as "balance the account."

Invocation differences between different target datastores in- 
clude details such as native query representation, error handling, 
and result data interpretation and processing. The native query 
representation typically can be strings (as with dynamic SQL), 
host variables (static SQL, stored procedures), or records (main- 
frame messaging). 

Encapsulating the native query details within query objects 
can standardize target database invocation. For instance, an ob- 
ject application would never know whether the query object 
contains a SQL string, or invokes a stored procedure or a mes- 
sage to a mainframe transaction-processing monitor. Figure 11 
presents two sets of encapsulated queries targeting two differ- 
ent types of datastores. 
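
A minimal Java sketch of this encapsulation follows. The interface `WriteQuery`, its two implementations, and the record layout are assumptions made for illustration; the point is only that the caller works against one protocol while the native representation (a SQL string or a fixed-width message record) stays hidden inside the query object.

```java
// Hypothetical common protocol: callers never see the native representation.
interface WriteQuery {
    // Returns what would be sent to the datastore; a real engine would execute it.
    String toNativeRequest(String custNo, String street);
}

// Dynamic SQL: the native form is a string.
final class SqlUpdateAddress implements WriteQuery {
    public String toNativeRequest(String custNo, String street) {
        // Values are inlined for display only; real code should bind parameters.
        return "UPDATE ADDRESS SET STREET = '" + street
                + "' WHERE CUSTNO = '" + custNo + "'";
    }
}

// Mainframe messaging: the native form is a fixed-width record (layout assumed).
final class RecordUpdateAddress implements WriteQuery {
    public String toNativeRequest(String custNo, String street) {
        // Transaction code, then an 8-character key and a 20-character street field.
        return String.format("UPDADDR%-8s%-20s", custNo, street);
    }
}

final class QueryDemo {
    public static void main(String[] args) {
        WriteQuery[] queries = { new SqlUpdateAddress(), new RecordUpdateAddress() };
        for (WriteQuery query : queries) {
            // The application code is identical for both datastores.
            System.out.println(query.toNativeRequest("456", "34 Main St"));
        }
    }
}
```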

Queries can be grouped into two broad categories: write
queries (SQL insert, update, and delete, for example) and read
queries (SQL select).

Input for write queries can be either keys (for instance, delete
an object based on its key) or full objects (insert an object),
either of which can be collections. Queries targeting relational
databases operate on a single object. Queries targeting stored
procedures or mainframe transaction-processing monitors
usually take multiple objects as input parameters.

Write queries extract the data from persistent objects and
convert it to the target database form. Depending on the datastore,
the data is placed into a query string, a query's host variables,
or a record structure. In the case of nested records (mainframe
messaging), the data may also need to be recomposed
according to the nesting structure; see Figure 12.

Because relational write queries can operate only on one 
object at a time, the number of database roundtrips within a 
complex transaction often becomes high. A useful performance 
optimization is to group the native queries together, then send 
them to the database as one package at the end of the trans- 
action. Many relational databases support this kind of "batch" 
behavior. For procedure calls this is the typical mode of op- 
eration. 

Read queries fall into two categories— those that have no 
scope limiting conditions ("all instances" queries, for exam- 
ple) and those that require parameters for search conditions 
("finder" queries). Read queries that require parameters must 
address the same data conversion and recomposition issues 
as the write queries. 

Figure 9: Making objects unique using a registry.

Restructuring the resulting data is necessary when the data is
not shaped along object lines and/or the result contains data for
more than one kind of object. For example, queries involving
certain inheritance strategies or reading ahead trees of objects
require joins and unions that result in tuples containing data for
multiple objects. A useful abstraction for the result processing
is a data extractor. The data extractor contains all the necessary
logic to extract, convert, validate, and compose the data into a
form suitable for the target persistent object. In case of
relational joins, the extraction logic must also eliminate
redundant entries in the result set; see Figure 13.
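
The redundancy-elimination step can be sketched as follows in Java. The row shape `{DEPTNO, DNAME, EMPNO, ENAME}` and the class names `DepartmentData` and `DepartmentExtractor` are assumptions for illustration; the sketch shows one plausible way an extractor could turn the flat tuples of a join into one parent entry per key with its children attached.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Composed result: one department with the names of its employees.
final class DepartmentData {
    final String deptNo;
    final String name;
    final List<String> employeeNames = new ArrayList<>();
    DepartmentData(String deptNo, String name) {
        this.deptNo = deptNo;
        this.name = name;
    }
}

// Hypothetical extractor for join rows shaped {DEPTNO, DNAME, EMPNO, ENAME}.
final class DepartmentExtractor {
    static List<DepartmentData> extract(List<String[]> joinRows) {
        // Keyed by DEPTNO so a parent repeated across rows is created only once.
        Map<String, DepartmentData> byKey = new LinkedHashMap<>();
        for (String[] row : joinRows) {
            DepartmentData dept = byKey.get(row[0]);
            if (dept == null) {                  // first time this parent appears
                dept = new DepartmentData(row[0], row[1]);
                byKey.put(row[0], dept);
            }
            if (row[2] != null) {                // outer join: the child may be missing
                dept.employeeNames.add(row[3]);
            }
        }
        return new ArrayList<>(byKey.values());
    }
}
```

Type conversion and validation, which the text also assigns to the extractor, are left out of this sketch.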

To optimize the number of database roundtrips, the read
queries need to be capable of loading trees of objects rather than
reading one object at a time. The required native operations for
relational queries are equijoin for loading chains of objects,
unions and set differences for loading trees, and left outer joins
for loading trees that allow missing leaves.

Associations 

Describing the associations be- 
tween object classes is an essential element of object modeling 
and design. UML and other object modeling methodologies pro- 
vide ways of defining the semantics of associations in terms of 
their cardinality and navigability. 

The behavior of associations can be fairly complex. The
implementation details can be hidden behind accessor methods
(get methods). Accessors for one-to-one associations return the
member object of the association. An accessor for a one-to-many
association returns a collection of member objects. Another
approach (see Figure 14) is to implement associations as first-class
objects (in-place association instances, proxies).

At run time, the object referential integrity should be
maintained according to the semantics specified in the object
model, while allowing the application programmer the easiest and
most flexible interface to the relationships. Mutators (set
methods, for example) and collection add/remove methods should
automatically invoke the appropriate referential integrity
maintenance behavior, such as updating the inverse association.

Associations are especially important for persistent objects
mapped to relational databases because associations can also
provide automatic means for maintaining the database key
referential integrity. When connecting persistent objects, the
association will determine which persistent object holds the foreign
key and update it appropriately with the primary key of the
other object. Manually coding the database key maintenance
is error prone and can easily lead to unmaintainable code.
Figure 15 illustrates automatic maintenance of object and
database key referential integrity. In this example, an employee
object is automatically removed from its old department when
the object is added to a new department. Also, the inverse
relationship from the employee to the department is updated
automatically.
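
The employee and department behavior just described can be sketched in Java with plain accessor and mutator methods. The class and field names are hypothetical, and the foreign key is modeled as a simple field that mirrors an assumed `DEPTNO` column; a real framework would generate or hide this logic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Department {
    final int deptNo;                       // primary key
    private final List<Employee> employees = new ArrayList<>();

    Department(int deptNo) { this.deptNo = deptNo; }

    List<Employee> getEmployees() {
        return Collections.unmodifiableList(employees);
    }

    void addEmployee(Employee e) {
        if (e.getDepartment() == this) return;
        e.setDepartment(this);              // delegates so both sides stay consistent
    }

    // Package-private helpers intended to be called only by Employee.setDepartment.
    void link(Employee e) { employees.add(e); }
    void unlink(Employee e) { employees.remove(e); }
}

final class Employee {
    final String name;
    private Department department;
    private Integer deptNoForeignKey;       // mirrors the assumed DEPTNO column

    Employee(String name) { this.name = name; }

    Department getDepartment() { return department; }
    Integer getDeptNoForeignKey() { return deptNoForeignKey; }

    void setDepartment(Department newDept) {
        if (department == newDept) return;
        if (department != null) department.unlink(this);   // leave the old department
        department = newDept;
        // The employee row holds the foreign key, so it is updated here.
        deptNoForeignKey = (newDept == null) ? null : newDept.deptNo;
        if (newDept != null) newDept.link(this);            // update the inverse side
    }
}
```

Whichever side the application calls, `addEmployee` or `setDepartment`, the object references and the foreign-key value end up consistent.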

Associations provide a semantically meaningful way for
controlling the retrieval of objects from the database. As the
application traverses associations, the related objects can be
retrieved accordingly. Depending on the association, it is
sometimes also desirable that traversal of one association triggers
the retrieval of an entire graph of related objects. However,
this kind of object graph read-ahead behavior requires advanced
querying and caching techniques as described in the previous
sections.

Translation from the object associations to the native database
relationships may be very complex (see Figure 16). A simple
relationship between two classes often translates to multiple
relationships between multiple tables when inheritance is involved.

Transactions 

In enterprise environments a single server application may serve 
multiple concurrent client transactions, each accessing an over- 
lapping set of objects. 

Many enterprise applications that reflect complex business
processes (see Figure 17) also require that users can navigate
freely between different views of the user interface, work
with the result of uncommitted changes across views, and
commit or cancel work that has been done on a view and
on all subviews opened in a nested fashion. In short, the
nature of complex multiuser enterprise applications requires
that objects can be accessed from multiple concurrent and
nested transactions.

To ensure the consistency of concurrently running
transactions they need to be isolated from each other. The two
methods for isolating the transactions are the conflict avoidance
scheme ("pessimistic" scheme) and the conflict detection
scheme ("optimistic" scheme). Which one to use depends on
the type of transaction. Transactions that have a high penalty
for failure should do whatever possible to prevent the failure
(by explicitly locking the resources as early as possible). With
low penalty transactions it is often worth trading the risk of
failure to gain efficiency by using a conflict detection scheme.

The objects are copied from the database into the
application's memory, where they may be held for extended periods
of time. Therefore, the transaction isolation actually consists of
two components: the object-level isolation within one
application, and the database-level isolation across multiple
applications. Both isolation components address multiuser issues,
because one server application may also serve multiple clients, as
in Figure 17.

The conflict avoidance scheme for GUI-driven, long-running
transactions is usually unacceptable from a performance
perspective. A conflict detection scheme where each transaction
has a version of the concurrently accessed objects provides
significantly better performance. However, managing multiple
versions of the same object can be fairly complex.



One approach for implementing an object versioning
mechanism is to divide business objects into two parts: a wrapper
and a version (for example, an EJBObject and an EntityBean).
When any object refers to a business object, it actually refers to
its wrapper. The wrapper delegates the method invocations to
the appropriate version, which contains the object's business
behavior and instance data. When a business object is first
accessed (get/set a property) within a transaction, a new version
of the object is added to the current transaction's local registry.
The new version is based on the version in the parent
transaction's registry. Figure 18 shows multiple object versions within
a tree of nested transactions.

Upon commit, the versions in a child transaction's registry are
merged with its parent transaction's corresponding versions. If
the transaction is a top-level transaction, the versions are also
written into the database. The logic for detecting and resolving
conflicts on merge is highly application dependent. The test may
be as simple as comparing parent and child version numbers in
order to determine if the parent version has been changed
after the child version was created. For more advanced
application-dependent testing the wrapper could have a conflict
resolution callback method.

update address set streetno=34, ... where custno=456 and streetno=56

Example 1: Update statement with conflict detection predicates.

Figure 11: Two sets of encapsulated queries.

Figure 12: Recomposing object instance data according to a record-nesting structure.

On rollback the child versions are simply dropped instead of 
having to restore object states in the parent transaction. After 
rollback there is no trace that ei- 
ther the child transaction or the 
child versions ever existed. 
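
The commit and rollback behavior of nested transactions can be sketched in Java as follows. This is a deliberately simplified model, not the wrapper and version design itself: each object is reduced to a single string value keyed by a hypothetical object identifier, the class `Tx` is invented for illustration, and no conflict detection is performed on merge.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical nested transaction holding its own versions of object state.
final class Tx {
    private final Tx parent;
    // object identifier -> this transaction's private version of one attribute
    private final Map<String, String> versions = new HashMap<>();

    Tx(Tx parent) { this.parent = parent; }

    Tx beginChild() { return new Tx(this); }

    // Reads see the nearest version up the parent chain (the wrapper's delegation).
    String get(String objectId) {
        for (Tx tx = this; tx != null; tx = tx.parent) {
            if (tx.versions.containsKey(objectId)) {
                return tx.versions.get(objectId);
            }
        }
        return null;
    }

    // Writes touch only this transaction's own version.
    void set(String objectId, String value) { versions.put(objectId, value); }

    // Commit merges child versions into the parent without checking for conflicts.
    void commit() {
        if (parent == null) {
            return; // a top-level commit would write the versions to the database here
        }
        parent.versions.putAll(versions);
        versions.clear();
    }

    // Rollback simply drops the versions; the parent is never touched.
    void rollback() { versions.clear(); }
}
```

Because a child never modifies its parent's versions until commit, rollback needs no undo log, which matches the observation that the child versions can simply be dropped.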

Many relational databases provide little support for row-level
conflict avoidance. With most databases the row-level locking
is available only in conjunction with cursors. However, cursors
may be of little use for an object application that is accessing
and holding onto large numbers of different types of objects in a
random fashion. One trick for acquiring a row-level lock
without a cursor is to touch a corresponding row (update a column
without changing its value, for instance) when an object is first
accessed within a transaction. If the row is already locked, the
desirable action is often to raise an exception instead of waiting
for the lock to be released.

As with object level isolation, the logic for detecting and re- 
solving database conflicts is application dependent. The two 
common conflict detection methods are either to reread and 
compare the database row to the modified object, or to add col- 
lision detection predicates (a set of attributes that constitute a 
conflict) to the where clause of the database update statement. 
Example 1 demonstrates conflict detection predicates. The up- 
date statement will fail if another user has changed the street 
number from its old value. 
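
A minimal Java sketch of building such a statement follows. The helper `OptimisticUpdate` is hypothetical and not part of any library; it only assembles a parameterized update whose where clause carries the key plus the old values of the conflict-detection columns. It assumes the caller passes maps with a stable iteration order and does not handle old values that are null, which would need `IS NULL` instead of `= ?`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Builds an UPDATE whose WHERE clause includes conflict-detection predicates.
final class OptimisticUpdate {
    final String sql;
    final List<Object> parameters;   // in the order of the '?' markers

    private OptimisticUpdate(String sql, List<Object> parameters) {
        this.sql = sql;
        this.parameters = parameters;
    }

    static OptimisticUpdate build(String table,
                                  Map<String, Object> newValues,
                                  Map<String, Object> key,
                                  Map<String, Object> oldValues) {
        List<Object> params = new ArrayList<>();
        StringBuilder sb = new StringBuilder("UPDATE ").append(table).append(" SET ");
        appendPairs(sb, newValues, ", ", params);
        sb.append(" WHERE ");
        Map<String, Object> predicates = new LinkedHashMap<>(key);
        predicates.putAll(oldValues);        // old values detect concurrent changes
        appendPairs(sb, predicates, " AND ", params);
        return new OptimisticUpdate(sb.toString(), params);
    }

    private static void appendPairs(StringBuilder sb, Map<String, Object> pairs,
                                    String separator, List<Object> params) {
        boolean first = true;
        for (Map.Entry<String, Object> e : pairs.entrySet()) {
            if (!first) sb.append(separator);
            sb.append(e.getKey()).append(" = ?");
            params.add(e.getValue());
            first = false;
        }
    }
}
```

When such a statement is run through JDBC, `PreparedStatement.executeUpdate()` returns the number of affected rows, so a result of zero can be treated as a detected conflict.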

Rereading and comparing rows is expensive and should be 
used sparingly, because it requires multiple database 
roundtrips — locking, reading, and updating the row. On the 
other hand, the use of conflict detection predicates is 
lightweight and works fine in most situations. More sophisti- 
cated detection schemes can be composed of combinations 
of the aforementioned commands. 






Most commercial databases have referential integrity (RI)
constraints for maintaining the consistency of the database.
These constraints require the database's store and delete
operations to be executed in a specific order. This order does
not necessarily match the order in which the objects are
created or deleted within an object application. Furthermore, the
database RI constraints do not map to the logical object
associations in a consistent way. RI rules are enforced based on the
foreign-key references, which may have more than one possible
transformation when mapped to object associations. Manually
coding the operation ordering is time consuming and error prone,
easily leading to unmaintainable code. It is preferable to defer
execution of the operations and let the transaction automatically
decide the ordering upon its commit.

The ordering algorithm utilizes the information of how the
object associations are mapped to the primary-key/foreign-key
column pairs in the database, and the integrity rules defined
for the key columns. For each object within the transaction,
the algorithm iterates over the associations the object has with
other objects. For each association, the algorithm tests if the
object has either insert precedence (if the object is to be
inserted) or delete precedence (if the object is to be deleted)
over the association. If the object has a higher precedence,
it will be moved accordingly in the transaction's participant
list. Due to the nature of relational RI constraints, the
algorithm remains fairly simple, because there cannot be circular
constraints defined in the database (otherwise it would be
impossible to insert a row that has a prerequisite to its own
prerequisite).
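
As an illustration of commit-time ordering, the following Java sketch orders writes with a depth-first topological sort over foreign-key dependencies. This is a swapped-in technique, not the precedence-based participant-list algorithm described above, and it works at the table level rather than per object. The class `WriteOrdering` and the table names used with it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical commit-time ordering based on foreign-key dependencies.
final class WriteOrdering {
    // table -> tables it references through foreign keys (its prerequisites)
    private final Map<String, List<String>> references = new LinkedHashMap<>();

    void addTable(String table, String... referencedTables) {
        List<String> refs = new ArrayList<>();
        Collections.addAll(refs, referencedTables);
        references.put(table, refs);
    }

    // Referenced (parent) tables come first, so inserts never violate a constraint.
    List<String> insertOrder() {
        List<String> ordered = new ArrayList<>();
        Set<String> done = new HashSet<>();
        for (String table : references.keySet()) {
            visit(table, done, ordered);
        }
        return ordered;
    }

    // Deletes run in the opposite order: referencing rows go before referenced rows.
    List<String> deleteOrder() {
        List<String> ordered = insertOrder();
        Collections.reverse(ordered);
        return ordered;
    }

    // Depth-first walk; assumes no circular constraints, as the text states.
    private void visit(String table, Set<String> done, List<String> ordered) {
        if (!done.add(table)) return;
        for (String parent
                : references.getOrDefault(table, Collections.<String>emptyList())) {
            visit(parent, done, ordered);
        }
        ordered.add(table);
    }
}
```

A transaction could apply the insert order to its pending inserts and the delete order to its pending deletes just before sending them to the database.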

API 

Figure 13: Restructuring a relational result set and eliminating redundant entries.

Figure 14: Associations implemented as first-class objects.

From the programming and maintenance point of view, the
number of persistence constructs that appear in the application
code should be kept as low as possible. Having a low number
of persistence constructs introduces minimal intrusion upon
the application, thus allowing the database and application to
remain loosely coupled. This loose coupling between the
database and application lets you design an object model that
models the application domain as opposed to modeling the
database design and vice versa. The persistence framework must
be intelligent enough to perform many of the necessary
persisting processes automatically, without instruction from the
application. Implementing persistent constructs as first-class
objects and providing some of the persistence metainformation
at run time are two of the keys that make a successful
persistence framework. The interfaces provided by the
persistence API can be grouped into the following categories:

• Business object interface. Protocol for accessing attributes from 
the business object. 

• Life cycle interface. Protocol for creating and destroying busi- 
ness object instances. 

• Finder interface. Protocol for finding business object instances. 

• Transaction interface. Protocol for creating, committing, and 
rolling back transactions. 

For example, the Enterprise JavaBeans (EJB) Specification
defines interfaces that correspond to these categories. The remote
interface for entity beans corresponds to the business object
interface. The EJB home interface has the same responsibilities as
the life cycle and finder interfaces. The transaction interface is
provided by the UserTransaction in the Java transaction
package, which is one of the prerequisites for the EJB.
Figure 15: Automatic maintenance of object and key referential integrity.




Figure 16: Complex translation from an object association to multiple database relationships. 



Transaction tx = new Transaction();
EmployeeHomeImpl empHome = EmployeeHomeImpl.singleton();
Employee emp;
AddressHomeImpl addrHome = AddressHomeImpl.singleton();
Address addr;

tx.begin();                            //begin a new transaction (transaction interface)
emp = empHome.findByKey("1234");       //find an employee instance (finder interface)
addr = addrHome.create();              //create an address instance (factory interface)
addr.setStreet("123 Somewhere Dr.");   //set attributes of the address (bus. object interface)
emp.setAddress(addr);                  //set employee's address (bus. object interface)
tx.commit();                           //commit the changes (transaction interface)

Example 2: Sample persistence API code.
Figure 17: A typical system configuration in an enterprise
environment.

Figure 18: Multiple object versions within a tree of nested
transactions.

Example 2 demonstrates the use of the persistence API by re- 
trieving an employee object, creating an address object, associ- 
ating these two objects together, and committing the changes 
to the database. 

Conclusion

The rationale for building an object persistence framework is,
of course, increased productivity and reduced maintenance costs.
Independence between object applications and databases allows 
enterprises to develop and maintain more complex applications 
and still leverage existing data management infrastructures. 

Implementing a full-blown object persistence framework
easily represents several years' worth of work. The more
flexibility and performance that is required from the framework,
the more complex the framework becomes. Yet almost any
framework is better than no framework. Even a simple
framework can help in structuring the code in a clean and logical
way. For example, the mapping metainformation can
implicitly be represented as inlined code and the query objects can
encapsulate handcrafted SQL strings. The areas worth
spending more time in creating generic components, however, are
the associations and the transactions because they have a
direct impact on the application programming model. There are
also several commercial object persistence frameworks
available that are usually a viable alternative to in-house
development, especially when the target application is complex and
critical to the enterprise.